A major challenge in indexing unstructured hypertext databases is to automatically extract 
meta-data that enables structured search using topic taxonomies, circumvents keyword 
ambiguity, and improves the quality of search and profile-based routing and filtering. 
Therefore, an accurate classifier is an essential component of a hypertext database. 
Hyperlinks pose new problems not addressed in the extensive text classification literature. 
Links clearly contain high-quality semantic clues that ... 

2 A practical hypertext categorization method using links and incrementally available 
class information 

Hyo-Jung Oh, Sung Hyon Myaeng, Mann-Ho Lee 

July 2000 Proceedings of the 23rd annual international ACM SIGIR conference on 
Research and development in information retrieval 


As the WWW grows at an increasing speed, a classifier targeted at hypertext has come into 
high demand. While document categorization is quite mature, the issue of utilizing 
hypertext structure and hyperlinks has been relatively unexplored. In this paper, we 
propose a practical method for enhancing both the speed and the quality of hypertext 
categorization using hyperlinks. In comparison against a recently proposed technique that 
appears to be the only one of the kind, we obtained up to 18.5% of ... 

Keywords: text categorization 
August 2004 Proceedings of the fifteenth ACM conference on Hypertext & hypermedia 

An important issue with the Web is verification of the accuracy, currency and authenticity of 
the information associated with Web sites. One way to address this problem is to identify 
Image Retrieval from the World Wide Web: Issues, Techniques, and Systems 

M. L. Kherfi, D. Ziou, A. Bernardi 

March 2004 ACM Computing Surveys (CSUR), Volume 36 issue 1 


With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have existed 
for some years now and have achieved a certain degree of success. However, despite the 
large number of images available on the Web, image search engines are still rare. In this 
article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 



2 Information retrieval on the web 

Mei Kobayashi, Koichi Takeda 

June 2000 ACM Computing Surveys (CSUR), Volume 32 issue 2 



In this paper we review studies of the growth of the Internet and technologies that are 
useful for information search and retrieval on the Web. We present data on the Internet 
from several different sources, e.g., current as well as projected number of users, hosts, 
and Web sites. Although numerical figures vary, overall trends cited by the sources are 
consistent and point to exponential growth in the past and in the coming decade. Hence it is 
not surprising that about 85% of Internet user ... 

Keywords: Internet, World Wide Web, clustering, indexing, information retrieval, 
knowledge management, search engine 